$\beta$-COALESCENTS AND STABLE GALTON-WATSON TREES



ROMAIN ABRAHAM AND JEAN-FRANÇOIS DELMAS

Abstract. Representation of coalescent processes using pruning of trees has been used by
Goldschmidt and Martin for the Bolthausen-Sznitman coalescent and by Abraham and Delmas
for the $\beta(3/2, 1/2)$-coalescent. By considering a pruning procedure on stable
Galton-Watson trees with $n$ labeled leaves, we give a representation of the discrete
$\beta(1+\alpha, 1-\alpha)$-coalescent, with $\alpha \in [1/2, 1)$, starting from the trivial
partition of the first $n$ integers. The construction can also be made directly on the stable
continuum Lévy tree, with parameter $1/\alpha$, simultaneously for all $n$. This representation
allows one to use results on the asymptotic number of coalescence events to get the asymptotic
number of cuts in stable Galton-Watson trees (with infinite variance for the reproduction law)
needed to isolate the root. Using convergence of the stable Galton-Watson tree conditioned to
have infinitely many leaves, one can get the asymptotic distribution of blocks in the last
coalescence event in the $\beta(1+\alpha, 1-\alpha)$-coalescent.



1. Introduction 

1.1. Framework. The idea of constructing coalescent processes by pruning discrete trees
first arose in [23], where the Bolthausen-Sznitman coalescent is constructed by a uniform
pruning of the branches of a random recursive tree; see also [30] and [20] for applications of
such a representation. The same kind of idea has been used in [3] to construct a
$\beta(3/2, 1/2)$-coalescent process using a uniform pruning of the branches of a uniform
random binary tree. This construction is also closely related to Aldous's continuum random
tree. The goal of this paper is to extend this result by applying a pruning at nodes
(introduced in [1] in a continuous setting and in [7] in a discrete setting) to a stable Lévy
tree, obtaining a $\beta(1+\alpha, 1-\alpha)$-coalescent process, with $1/2 < \alpha < 1$.

Let $\Lambda$ be a finite measure on $[0,1]$. A $\Lambda$-coalescent $(\Pi(t), t \ge 0)$ is a Markov process
which takes values in the set of partitions of $\mathbb{N}^* = \{1, 2, \ldots\}$; it was introduced in [29] for coalescent
processes with possible multiple collisions. It is defined via the transition rates of its
restriction $\Pi^{[n]} = (\Pi^{[n]}(t), t \ge 0)$ to the first $n$ integers: if $\Pi^{[n]}(t)$ is composed of $b$ blocks, then $k$
($2 \le k \le b$) fixed blocks coalesce at rate:

\[ \lambda_{b,k} = \int_0^1 u^{k-2}(1-u)^{b-k}\,\Lambda(du). \tag{1} \]

In particular a coalescence event happens at rate:

\[ \lambda_b = \sum_{k=2}^{b} \binom{b}{k}\,\lambda_{b,k}. \tag{2} \]

We also define the discrete process $\Pi^{[n]}_{\mathrm{dis}} = (\Pi^{[n]}_{\mathrm{dis}}(k), k \in \mathbb{N})$ as the successive distinct states of
the process $\Pi^{[n]}$ until it reaches the absorbing state (which is the trivial partition consisting
of one block); afterward the discrete process remains constant.
As examples of $\Lambda$-coalescents, let us mention:



Date: March 28, 2013. 






• Kingman's coalescent with $\Lambda(dx) = \delta_0(dx)$, see [27].

• the Bolthausen-Sznitman coalescent with $\Lambda(dx) = \mathbf{1}_{(0,1)}(x)\,dx$, see [13].

• the $\beta$-coalescents, for which $\Lambda(dx)$ is (up to a multiplicative constant) the $\beta(a, b)$
distribution. For the $\beta(1+\alpha, 1-\alpha)$-coalescent, that is $\Lambda(dx) = (x/(1-x))^{\alpha}\,dx$,
see [12], [10] for $-1 < \alpha < 0$. The case $\alpha = 0$ corresponds to the Bolthausen-Sznitman
coalescent, while the limit case $\alpha = -1$ formally corresponds to Kingman's coalescent.
For the $\beta(1+\alpha, -\alpha)$-coalescent, with $-1 < \alpha < 0$, see [19].

We refer to the survey [11] for further results on coalescent processes. 
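The rates (1) and (2) are easy to evaluate for the $\beta(1+\alpha, 1-\alpha)$ example, since the integral in (1) is then an Euler beta integral. The following is a minimal sketch, not taken from the paper: the function names `beta_fn`, `rate`, `total_rate` and `merger_size_law` are hypothetical, and the measure is assumed to be $\Lambda(dx) = (x/(1-x))^{\alpha}dx$ without normalizing constant.

```python
from math import comb, exp, lgamma


def beta_fn(a: float, b: float) -> float:
    """Euler beta function B(a, b) for a, b > 0, computed through log-gamma."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def rate(b: int, k: int, alpha: float) -> float:
    """lambda_{b,k} from (1), assuming Lambda(dx) = (x/(1-x))**alpha dx.

    The integrand is u**(k-2+alpha) * (1-u)**(b-k-alpha) on (0, 1).
    """
    return beta_fn(k - 1 + alpha, b - k + 1 - alpha)


def total_rate(b: int, alpha: float) -> float:
    """lambda_b from (2): total rate of a coalescence event among b blocks."""
    return sum(comb(b, k) * rate(b, k, alpha) for k in range(2, b + 1))


def merger_size_law(b: int, alpha: float) -> dict[int, float]:
    """Law of the number k of blocks merging at the next event of the jump chain."""
    lam_b = total_rate(b, alpha)
    return {k: comb(b, k) * rate(b, k, alpha) / lam_b for k in range(2, b + 1)}
```

For instance, `merger_size_law(7, 0.6)` returns the probabilities that 2, 3, ..., 7 of the 7 current blocks merge at the next step of the discrete process. Working with the jump chain only is a deliberate choice here: a multiplicative constant in $\Lambda$ cancels in the ratio.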

Let $\alpha \in [1/2,1)$. We consider a Galton-Watson (GW) tree $\mathcal{T}$ with offspring distribution
characterized by its generating function, for $r \in [0, 1]$:

\[ g(r) = r + \alpha(1-r)^{1/\alpha}. \tag{3} \]
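To make the offspring distribution behind (3) concrete, one can expand $(1-r)^{1/\alpha}$ as a binomial series and read off the coefficients of $g$. The sketch below does this with a simple recursion; the names `offspring_law` and `kmax` are hypothetical, and the truncation at `kmax` is an assumption of the example (the law has infinite support when $\alpha > 1/2$).

```python
def offspring_law(alpha: float, kmax: int) -> list[float]:
    """First kmax + 1 probabilities p_0, ..., p_kmax of the law with
    generating function g(r) = r + alpha * (1 - r) ** (1 / alpha).

    Requires kmax >= 1.
    """
    gamma = 1.0 / alpha
    # c_k = coefficient of r**k in (1 - r)**gamma, with c_0 = 1 and
    # c_k = c_{k-1} * (k - 1 - gamma) / k.
    coeffs = [1.0]
    for k in range(1, kmax + 1):
        coeffs.append(coeffs[-1] * (k - 1 - gamma) / k)
    probs = [alpha * c for c in coeffs]
    probs[1] += 1.0  # the extra term r in g; it cancels alpha * c_1 = -1
    return probs
```

With `alpha = 0.5` this returns the critical binary law (mass 1/2 at 0 and at 2). For `alpha` in (1/2, 1) the value at 1 vanishes up to rounding, in line with $g'(0) = 0$, and the truncated mean stays slightly below 1 because of the heavy tail.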

This GW tree arises as the shape of the sub-tree of a stable Lévy tree with index $\gamma = 1/\alpha$
generated by leaves chosen in a Poissonian manner, see [16], Theorem 3.2.1. We shall call
these random trees the stable GW trees with parameter $\gamma$. We denote by $\mathbb{P}$ the distribution
of $\mathcal{T}$. If $x$ is a node of $\mathcal{T}$, we denote by $k_x(\mathcal{T})$ the number of offspring of $x$. Since $g'(0) = 0$,
we get that a.s. $k_x(\mathcal{T}) \neq 1$ for all $x \in \mathcal{T}$. We denote by $\mathbb{P}_n$ the law of $\mathcal{T}$ conditioned to have
exactly $n$ leaves (a leaf is a node with no offspring). Under $\mathbb{P}_n$, we label the leaves of $\mathcal{T}$ from
1 to $n$ uniformly at random, independently of $\mathcal{T}$, and then we consider the following pruning
procedure, which is derived from [8], see Section 2.2. Choose an internal node $x_1$ (which has
at least 2 children) at random with probability:

\[ \frac{k_{x_1}(\mathcal{T}) - 1}{L(\mathcal{T}) - 1}, \]

where $L(\mathcal{T})$ denotes the number of leaves of $\mathcal{T}$. We cut that node and keep the part $\mathcal{T}^{(1)}$ of the tree that contains the root, and keep $x_1$, which
is now a leaf of $\mathcal{T}^{(1)}$, and we label $x_1$ by the block (i.e. the sequence) of labels of the leaves
"above" $x_1$. We then iterate the procedure on the tree $\mathcal{T}^{(1)}$ and so on until the root is cut
(see Figure 1).

This pruning procedure defines a discrete time process $\Pi^{[n]}_{\mathrm{GW}} = (\Pi^{[n]}_{\mathrm{GW}}(k), k \in \mathbb{N})$ taking
values in the set of partitions of the first $n$ integers, $\Pi^{[n]}_{\mathrm{GW}}(k)$ being the set of labels of the
leaves of the tree $\mathcal{T}^{(k)}$ obtained after the $k$-th cut.
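The pruning procedure just described can be sketched on any finite rooted tree whose internal nodes all have at least two children. The code below is an illustration under that assumption, not the authors' implementation: the tree is encoded as a hypothetical dictionary `children` mapping each node to the list of its children, and the function only counts the cuts needed until the root is chosen (it does not track the blocks of labels).

```python
import random


def prune_until_root_cut(children: dict, root, rng: random.Random) -> int:
    """Run the pruning at nodes and return the number of cuts until the root is cut.

    `children` maps every node to the list of its children (leaves map to []).
    Every internal node is assumed to have at least two children.
    """
    children = {v: list(c) for v, c in children.items()}  # work on a copy
    cuts = 0
    while children[root]:
        # Collect the nodes still attached to the root.
        alive, stack = [], [root]
        while stack:
            v = stack.pop()
            alive.append(v)
            stack.extend(children[v])
        internal = [v for v in alive if children[v]]
        # Choose x with probability (k_x - 1) / (L - 1): the weights k_x - 1
        # sum to L - 1, where L is the current number of leaves.
        weights = [len(children[v]) - 1 for v in internal]
        x = rng.choices(internal, weights=weights)[0]
        children[x] = []  # x is cut and becomes a leaf of the pruned tree
        cuts += 1
    return cuts
```

On a star tree (a root with $n$ leaf children) the function always returns 1, since the root is the only internal node. On a tree with three leaves made of a root with one leaf child and one binary child, it returns 1 or 2 with equal probability.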



1.2. Main result. The process $\Pi^{[n]}_{\mathrm{GW}}$ is then a coalescent process: it starts from the trivial
partition consisting of singletons, and blocks merge together as time goes by. Its law is given
in the next theorem.

Theorem 1.1. We set $\alpha = 1/\gamma \in [1/2,1)$. The process $\Pi^{[n]}_{\mathrm{GW}}$ is distributed under $\mathbb{P}_n$ as $\Pi^{[n]}_{\mathrm{dis}}$
for the $\beta(1+\alpha, 1-\alpha)$-coalescent with coalescent measure:

\[ \Lambda(dx) = \left(\frac{x}{1-x}\right)^{\alpha} dx. \tag{4} \]

Notice that the process $\Pi^{[n]}_{\mathrm{dis}}$ is discrete in time and thus characterizes the coalescent
measure only up to a multiplicative constant.

Figure 1. The pruning at node of a given tree. The bold internal node
corresponds to the next chosen node.

One major drawback of this construction is that we define the process for fixed $n$ and not
simultaneously for all $n$. However, as in [1], we can directly construct the process $(\Pi(\theta), \theta \ge 0)$
taking values in the set of partitions of the integers using the pruning of a Lévy continuum
random tree. More precisely, we consider the weighted stable Lévy tree $(\mathcal{T}, d, \mathbf{m}^{\mathcal{T}})$ associated
with the branching mechanism $\psi(\lambda) = \lambda^{\gamma}$ for $\gamma \in (1,2)$ (the case $\gamma = 2$ is studied in [3] and
requires a different pruning). We recall that $\mathcal{T}$ is a real tree and that $\mathbf{m}^{\mathcal{T}}$ corresponds to a
uniform measure on the leaves of $\mathcal{T}$, see [16], [17] and also [9] more specifically for the space
of weighted real trees. We work under the so-called normalized excursion measure $\mathbb{N}^{(1)}$, under
which $\mathbf{m}^{\mathcal{T}}$ is a probability measure. We consider, given $\mathcal{T}$, the pruning defined in [1]: to each
branching point $x$ of $\mathcal{T}$ we can associate a "mass" $\Delta_x$ of this node, which intuitively represents
the size of its progeny, and a random variable $E_x$ which is exponentially distributed with
parameter $\Delta_x$. This random variable represents the time at which the node $x$ is cut. When
we cut such a node, we remove the sub-tree above it. Let $\mathcal{T}_\theta$ denote the continuum random
sub-tree obtained at time $\theta \ge 0$. We define a coalescent process using the usual paintbox
procedure. Let $(U_i, i \in \mathbb{N}^*)$ be independent random variables with distribution $\mathbf{m}^{\mathcal{T}}$ under
$\mathbb{N}^{(1)}$. We define a partition $\Pi_{\text{Lévy}}(\theta)$ of $\mathbb{N}^*$ at time $\theta$ by saying that two integers $i$ and $j$ belong
to the same block of $\Pi_{\text{Lévy}}(\theta)$ if and only if the random variables $U_i$ and $U_j$ have a leaf of $\mathcal{T}_\theta$ as a
common ancestor. Intuitively this means that $U_i$ and $U_j$ belong to the same sub-tree attached
above $\mathcal{T}_\theta$. This defines a coalescent process $\Pi_{\text{Lévy}} = (\Pi_{\text{Lévy}}(\theta), \theta \ge 0)$. We are now interested
in its discrete (in time) restriction to the first $n$ integers. Let $\Pi^{[n]}_{\text{Lévy}} = (\Pi^{[n]}_{\text{Lévy}}(k), k \in \mathbb{N})$ be
the discrete process associated with $\Pi_{\text{Lévy}}$ restricted to the first $n$ integers until it reaches the
absorbing state (which is the trivial partition consisting of one block) and which afterward
remains constant.

By construction, and thanks to Theorem 3.2.1 in [16], we can deduce that under $\mathbb{N}^{(1)}$,
the discrete coalescent process $\Pi^{[n]}_{\text{Lévy}}$ is distributed as $\Pi^{[n]}_{\mathrm{GW}}$ under $\mathbb{P}_n$. In fact, we have the
following stronger result.

Theorem 1.2. We set $\alpha = 1/\gamma \in (1/2,1)$. Under $\mathbb{N}^{(1)}$, the processes $(\Pi^{[n]}_{\text{Lévy}}, n \in \mathbb{N}^*)$ associated
with the Lévy tree with branching mechanism $\psi(\lambda) = \lambda^{\gamma}$ are distributed as $(\Pi^{[n]}_{\mathrm{dis}}, n \in \mathbb{N}^*)$
associated with the coalescent measure $\Lambda(dx) = (x/(1-x))^{\alpha}\,dx$.

We conjecture that the process $\Pi_{\text{Lévy}}$ under $\mathbb{N}^{(1)}$ is, up to a random time change, a
$\beta(1+\alpha, 1-\alpha)$-coalescent.

Remark 1.3. Let us remark that the $\beta(1+\alpha, 1-\alpha)$-coalescent we obtain is also a $\beta(2-a, a)$-coalescent
(with $a = 1-\alpha$) as in [10], but with a different range for $\alpha$. The difference between
the two cases is that in [10] $\alpha \in (-1,0)$ and the coalescent process comes down from infinity
(i.e. for every positive time $\theta$, the partition $\Pi(\theta)$ contains only a finite number of blocks),
whereas in our case $\alpha \in (1/2, 1)$ the process always contains an infinite number of singletons
(also called "dust").

1.3. Number of cuts needed to isolate the root in a stable GW tree. We now give
an application of our representation, using results on $\beta$-coalescents to get the asymptotic
number of cuts needed to isolate the root in a stable GW tree with $n$ leaves. Notice that the
reproduction law of the stable GW tree has infinite variance for $\alpha \in (1/2,1)$, whereas it is
finite if $\alpha = 1/2$. For GW trees with finite variance for the reproduction law, consider the
cutting procedure given by choosing a node at random and removing the trees attached to
this node that do not contain the root. Let $Z'_n$ denote the number of cuts needed to isolate the
root when the GW tree has $n$ leaves. The limit $Z'$ of $Z'_n$ is given in [25] for the convergence
in distribution and in [6] for an a.s. convergence in the case of binary trees. In particular, $Z'$
is distributed as the height of a random leaf of the normalized Lévy tree with $\gamma = 2$, that is,
up to some scaling factor, of Aldous's continuum random tree.

Let $Z_n$ be the number of cuts, using the procedure developed in Section 1.1, needed to
isolate the root of a stable GW tree:

\[ Z_n = \inf\{k;\ \Pi^{[n]}_{\mathrm{GW}}(k) = \{\{1, \ldots, n\}\}\}. \]

Notice that for $r$-ary trees, since all the internal nodes have the same degree, the cutting
procedure given in Section 1.1 corresponds to choosing an internal node uniformly, which is
the cutting procedure in [25].

From [21], [24], [22] on the asymptotics of the number of coalescence events in $\beta$-coalescents,
we can then deduce the following result, which extends part of the result in [25] to GW trees
with infinite variance of the reproduction law.

Proposition 1.4. Let $\alpha = 1/\gamma \in [1/2, 1)$. We have the following convergence in distribution:

\[ n^{\alpha-1} Z_n \xrightarrow[n \to +\infty]{(d)} Z, \]

with the distribution of $Z$ characterized by, for $n \in \mathbb{N}^*$:

\[ \mathbb{E}[Z^n] = \alpha^n\, \frac{\Gamma(n+1)\,\Gamma(1-\alpha)}{\Gamma((n+1)(1-\alpha))}. \]

The distribution of $Z$ corresponds to the expected limit distribution in the Conjecture of
[3] for the number of cuts needed to isolate the root in general GW trees. (Notice that in
the conjecture, one chooses an internal node $x \in \mathcal{T}$ with probability proportional to $k_x(\mathcal{T})$,
whereas in Section 1.1 one chooses an internal node $x \in \mathcal{T}$ with probability proportional to
$k_x(\mathcal{T}) - 1$.) In particular, $Z$ is distributed as the height of a random leaf of the normalized
Lévy tree with branching mechanism $\psi(\lambda) = \lambda^{\gamma}$.

The proof of the Proposition is given in Section 5.
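As a sanity check on the moments of $Z$, one can evaluate them numerically. The sketch below assumes the moment formula reads $\mathbb{E}[Z^n] = \alpha^n \Gamma(n+1)\Gamma(1-\alpha)/\Gamma((n+1)(1-\alpha))$; the function name `moment_Z` is hypothetical. For $\alpha = 1/2$ the Legendre duplication formula reduces this expression to $\Gamma(1 + n/2)$, which are the moments of the law with density $2x\,e^{-x^2}$ on $(0, +\infty)$.

```python
from math import exp, lgamma, log


def moment_Z(n: int, alpha: float) -> float:
    """n-th moment alpha**n * Gamma(n+1) * Gamma(1-alpha) / Gamma((n+1)*(1-alpha)),
    evaluated through log-gamma to avoid overflow for large n."""
    return exp(
        n * log(alpha)
        + lgamma(n + 1)
        + lgamma(1 - alpha)
        - lgamma((n + 1) * (1 - alpha))
    )
```

Since the moments grow slowly enough, they determine the distribution of $Z$; the code only illustrates the formula and does not prove the convergence.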
1.4. Number of blocks in the last coalescence event. Using the pruning of GW trees
conditioned to have an infinite number of leaves (which is very close to Kesten's result on GW
trees conditioned on non-extinction), we get the asymptotics of the number $B_n$ of blocks
involved in the last coalescence event of $\Pi^{[n]}_{\mathrm{GW}}$.

The proof of the following Proposition is given in Section 6.

Proposition 1.5. Let $\alpha = 1/\gamma \in [1/2, 1)$. We have the following convergence in distribution:

\[ B_n \xrightarrow[n \to +\infty]{(d)} B, \]

with the distribution of $B$ given by its generating function $\varphi_\alpha(r) = \mathbb{E}[r^B]$, with for $r \in [0, 1]$:

\[ \varphi_\alpha(r) = (1-\alpha)\, r \int_0^1 \frac{dx}{1-(1-x)^{\alpha}} \left( \frac{1}{(1-rx)^{\alpha}} - 1 \right). \tag{5} \]

See also [1] for more results in this direction when $\alpha = 1/2$, including the number of
singletons involved in the last coalescence event as well as a closed form for $\varphi_{1/2}$.

One can also check, using results from Section 6, that in the last coalescence event of the
first $n$ integers one block is of order $n$, while the other blocks are of order 1.
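The generating function of Proposition 1.5 can be explored numerically. The sketch below assumes that (5) reads $\varphi_\alpha(r) = (1-\alpha)\,r\int_0^1 \big((1-rx)^{-\alpha}-1\big)/\big(1-(1-x)^{\alpha}\big)\,dx$ and uses a plain midpoint rule; the name `phi` and the parameter `steps` are hypothetical, and `alpha` must be non-zero.

```python
def phi(alpha: float, r: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral formula (5), for alpha != 0."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoints avoid the endpoints x = 0 and x = 1
        total += ((1.0 - r * x) ** (-alpha) - 1.0) / (1.0 - (1.0 - x) ** alpha)
    return (1.0 - alpha) * r * total * h
```

Two checks are available in closed form: for `alpha = -1` the integrand equals $r(1-x)$, so `phi(-1.0, r)` returns $r^2$ up to rounding, and for $\alpha \in (0,1)$ one has $\varphi_\alpha(1) = 1$. The midpoint rule converges slowly at $r = 1$ because of the integrable singularity at $x = 1$, so only a few digits should be expected there.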

Remark 1.6. For $\alpha = 0$ (that is, letting $\alpha$ go to 0 in (5)), we get:

\[ \varphi_0(r) = r \int_0^1 \frac{\log(1-rx)}{\log(1-x)}\,dx. \]

Using elementary computations, one can see that $\varphi_0(r) = r \sum_{n \in \mathbb{N}^*} q_n r^n$ with, for $n \in \mathbb{N}^*$:

\[ q_n = -\frac{1}{n} \int_0^1 \frac{x^n}{\log(1-x)}\,dx = \frac{1}{n} \int_0^{+\infty} (1-e^{-v})^n\, e^{-v}\, \frac{dv}{v} = \frac{1}{n} \sum_{k=1}^{n} \binom{n}{k} (-1)^{k+1} \log(k+1). \]

Therefore $\varphi_0$ is the generating function of the asymptotic number of blocks of the last
coalescence event in the Bolthausen-Sznitman coalescent, whose distribution is given in Theorem
3.1 and Proposition 3.2 of [23]. Then notice that the Bolthausen-Sznitman coalescent
corresponds indeed to the $\beta(1+\alpha, 1-\alpha)$-coalescent (with coalescent measure given by (4)) with
$\alpha = 0$.
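The two expressions for the coefficients $q_n$ in Remark 1.6 can be compared numerically for small $n$. This is only an illustrative sketch: the names `q_closed_form` and `q_integral` are hypothetical, the integral is approximated by a midpoint rule, and the alternating sum is numerically reliable only for moderate $n$.

```python
from math import comb, log


def q_closed_form(n: int) -> float:
    """(1/n) * sum_{k=1}^{n} C(n, k) * (-1)**(k+1) * log(k+1)."""
    return sum(comb(n, k) * (-1) ** (k + 1) * log(k + 1) for k in range(1, n + 1)) / n


def q_integral(n: int, steps: int = 200_000) -> float:
    """Midpoint approximation of -(1/n) * integral over (0,1) of x**n / log(1-x)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** n / log(1.0 - x)
    return -total * h / n
```

In particular `q_closed_form(1)` equals $\log 2$, so under this reading the limiting last coalescence event of the Bolthausen-Sznitman coalescent involves exactly two blocks with probability $\log 2$.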

For $\alpha = -1$, we get $\varphi_{-1}(r) = r^2$. Notice that $\varphi_{-1}$ is trivially the generating function of
the number of blocks of the last (in fact of every) coalescence event in Kingman's coalescent,
as all the coalescence events are binary. Furthermore, Kingman's coalescent can be seen as
the limit of the $\beta(1+\alpha, 1-\alpha)$-coalescent as $\alpha$ goes down to $-1$.

We wonder if $\varphi_\alpha$ is the generating function of the asymptotic number of blocks of the last
coalescence event in the $\beta(1+\alpha, 1-\alpha)$-coalescent (with coalescent measure given by (4))
with $\alpha \in (-1,1)$. Notice also that multiplying the coalescent measure by a constant doesn't
change the distribution of the number of blocks in the last coalescence event. Notice also
that $\varphi'_\alpha(1) = +\infty$ for $\alpha > 0$ (when the coalescent doesn't come down from infinity) and
$\varphi'_\alpha(1) < +\infty$ for $\alpha < 0$ (when the coalescent comes down from infinity).

1.5. Organization of the paper. Section 2 gives a representation of the pruning at node
procedure for GW trees in continuous time, motivated by [8]. This procedure corresponds in
fact to the one presented in the Introduction, Section 1.1. Section 3 is devoted to the proof of
Theorem 1.1. Section 4, devoted to the proof of Theorem 1.2, is more technical as it relies on
continuum random Lévy trees and the pruning of such trees as developed in [1]. Finally,
Sections 5 and 6 are devoted to the proofs of Propositions 1.4 and 1.5.

2. Pruning at node of discrete GW trees

2.1. Discrete trees. Let us recall here the formalism for ordered discrete trees. We set

\[ \mathcal{U} = \bigcup_{n \ge 0} (\mathbb{N}^*)^n \]

the set of finite sequences of positive integers, with the convention $(\mathbb{N}^*)^0 = \{\emptyset\}$. For $u \in \mathcal{U}$
let $|u|$ be the length or generation of $u$, defined as the integer $n$ such that $u \in (\mathbb{N}^*)^n$. If $u$ and
$v$ are two sequences of $\mathcal{U}$, we denote by $uv$ the concatenation of the two sequences, with the
convention that $uv = u$ if $v = \emptyset$ and $uv = v$ if $u = \emptyset$. The set of ancestors of $u$ is the set:

\[ A_u = \{v \in \mathcal{U};\ \text{there exists } w \in \mathcal{U} \text{ such that } u = vw\}. \tag{6} \]

A discrete tree $\mathbf{t}$ is a subset of $\mathcal{U}$ that satisfies:

• $\emptyset \in \mathbf{t}$,

• If $u \in \mathbf{t}$, then $A_u \subset \mathbf{t}$.

• For every $u \in \mathbf{t}$, there exists a non-negative integer $k_u(\mathbf{t})$ such that, for every positive
integer $i$, $ui \in \mathbf{t}$ iff $1 \le i \le k_u(\mathbf{t})$.

The integer $k_u(\mathbf{t})$ represents the number of offspring of the node $u$ in the tree $\mathbf{t}$. We define
$\mathcal{L}(\mathbf{t})$ the set of leaves of $\mathbf{t}$ and $\mathcal{N}(\mathbf{t})$ the set of internal nodes of $\mathbf{t}$ as:

\[ \mathcal{L}(\mathbf{t}) = \{u \in \mathbf{t},\ k_u(\mathbf{t}) = 0\} \quad\text{and}\quad \mathcal{N}(\mathbf{t}) = \mathbf{t} \setminus \mathcal{L}(\mathbf{t}). \]

Let $L(\mathbf{t}) = \mathrm{Card}(\mathcal{L}(\mathbf{t}))$ be the number of leaves of the tree $\mathbf{t}$, and notice that:

\[ L(\mathbf{t}) - 1 = \sum_{u \in \mathcal{N}(\mathbf{t})} (k_u(\mathbf{t}) - 1). \tag{7} \]

We denote by $\mathbb{T}$ the set of discrete trees and by $\mathbb{T}_n = \{\mathbf{t} \in \mathbb{T};\ L(\mathbf{t}) = n\}$ the set of discrete
trees with $n$ leaves.
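The formalism of this subsection translates directly into code if a node is encoded as a tuple of positive integers and the root as the empty tuple. The sketch below is only an illustration for finite trees; the names `is_tree`, `n_children` and `leaf_identity_holds` are hypothetical. It checks the three defining properties and the identity (7).

```python
def is_tree(t: set) -> bool:
    """Check the defining properties of a discrete tree for a finite set of tuples."""
    if () not in t:
        return False
    for u in t:
        if any(i < 1 for i in u):
            return False
        # The parent must belong to t; by induction every ancestor then does.
        if u and u[:-1] not in t:
            return False
        # Children are numbered 1..k_u: the i-th child needs the (i-1)-th one.
        if u and u[-1] > 1 and u[:-1] + (u[-1] - 1,) not in t:
            return False
    return True


def n_children(t: set, u: tuple) -> int:
    """k_u(t): number of children of the node u in the tree t."""
    k = 0
    while u + (k + 1,) in t:
        k += 1
    return k


def leaf_identity_holds(t: set) -> bool:
    """Check identity (7): L(t) - 1 equals the sum of k_u(t) - 1 over internal nodes."""
    degrees = {u: n_children(t, u) for u in t}
    n_leaves = sum(1 for k in degrees.values() if k == 0)
    return n_leaves - 1 == sum(k - 1 for k in degrees.values() if k > 0)
```

For example, the set `{(), (1,), (2,), (3,), (2, 1), (2, 2)}` is a tree with four leaves and two internal nodes with 3 and 2 children, and (7) reads $4 - 1 = 2 + 1$.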

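2.2. A discrete tree-valued process. We consider the pruning procedure developed in
[7]. Let $\mathbf{t} \in \mathbb{T}$. Under some probability measure $\mathbb{P}^{\mathbf{t}}$, we consider a family $(\xi_u, u \in \mathcal{U})$
of independent non-negative real random variables (possibly infinite) such that $\mathbb{P}^{\mathbf{t}}$-a.s.
$\xi_u = +\infty$ for $u \notin \mathbf{t}$ or $u \in \mathbf{t}$ such that $k_u(\mathbf{t}) \in \{0, 1\}$, and for $u \in \mathbf{t}$ such that $k_u(\mathbf{t}) \ge 2$:

\[ \mathbb{P}^{\mathbf{t}}(\xi_u > \theta) = (1+\theta)^{1-k_u(\mathbf{t})}. \]

At time $\theta$, we define the pruned tree $\mathcal{P}_\theta(\mathbf{t})$ as the sub-tree given by:

\[ \mathcal{P}_\theta(\mathbf{t}) = \{u \in \mathbf{t};\ \xi_v > \theta \text{ for all } v \in A_u\}. \]

For $u \in \mathcal{N}(\mathbf{t})$, let $D_u$ be the event that $u$ is marked first, that is:

\[ D_u = \Big\{\xi_u = \min_{v \in \mathcal{N}(\mathbf{t})} \xi_v\Big\}. \]

Lemma 2.1. Let $u \in \mathcal{N}(\mathbf{t})$. We have:

\[ \mathbb{P}^{\mathbf{t}}(D_u) = \frac{k_u(\mathbf{t}) - 1}{L(\mathbf{t}) - 1}. \]

This lemma implies that the cutting procedure given in Section 1.1 corresponds to the
successive states of the process $(\mathcal{P}_\theta(\mathbf{t}), \theta \ge 0)$.

Proof. We have, using (7) for the last equality:

\begin{align*}
\mathbb{P}^{\mathbf{t}}(D_u) &= \mathbb{P}^{\mathbf{t}}\big(\xi_u < \xi_v \ \ \forall v \neq u,\ v \in \mathcal{N}(\mathbf{t})\big) \\
&= \mathbb{E}^{\mathbf{t}}\Big[(1+\xi_u)^{-\sum_{v \neq u,\, v \in \mathcal{N}(\mathbf{t})} (k_v(\mathbf{t}) - 1)}\Big] \\
&= (k_u(\mathbf{t}) - 1) \int_{[0,+\infty)} (1+\theta)^{-\sum_{v \in \mathcal{N}(\mathbf{t})} (k_v(\mathbf{t}) - 1) - 1}\, d\theta \\
&= \frac{k_u(\mathbf{t}) - 1}{\sum_{v \in \mathcal{N}(\mathbf{t})} (k_v(\mathbf{t}) - 1)} \\
&= \frac{k_u(\mathbf{t}) - 1}{L(\mathbf{t}) - 1}.
\end{align*}

□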
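Lemma 2.1 can also be illustrated by simulation. The sketch below samples the marks by inverse transform from $\mathbb{P}(\xi_u > \theta) = (1+\theta)^{1-k_u}$ and records which internal node carries the smallest mark. The function name `first_marked_frequencies` and its arguments are hypothetical, and only the numbers of children of the internal nodes are needed, not the shape of the tree.

```python
import random


def first_marked_frequencies(degrees: dict, trials: int, seed: int = 0) -> dict:
    """Empirical frequency with which each internal node is marked first.

    `degrees` maps each internal node to its number of children (at least 2).
    """
    rng = random.Random(seed)
    wins = {u: 0 for u in degrees}
    for _ in range(trials):
        # Inverse transform: if V is uniform on (0, 1], then
        # xi = V ** (-1 / (k - 1)) - 1 satisfies P(xi > theta) = (1 + theta) ** (1 - k).
        marks = {
            u: (1.0 - rng.random()) ** (-1.0 / (k - 1)) - 1.0
            for u, k in degrees.items()
        }
        wins[min(marks, key=marks.get)] += 1
    return {u: w / trials for u, w in wins.items()}
```

With `degrees = {"a": 2, "b": 3, "c": 4}` the tree has $L - 1 = 1 + 2 + 3 = 6$, so the empirical frequencies should be close to 1/6, 1/3 and 1/2.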

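2.3. Construction of the partition-valued process $\Pi^{[n]}_{\mathrm{GW}}$. Let $\alpha \in [1/2,1)$. Recall that
the function $g$ defined by (3) is the generating function of a probability measure $\nu_g$ on $\mathbb{N}$. We
denote by $G_g(d\mathcal{T})$ the distribution on $\mathbb{T}$ of the critical GW tree with offspring distribution
$\nu_g$. We will denote by $\mathbb{P}$ the probability measure on $\mathbb{T} \times [0, +\infty]^{\mathcal{U}}$:

\[ \mathbb{P}(d\mathcal{T}, d\xi) = G_g(d\mathcal{T})\, \mathbb{P}^{\mathcal{T}}(d\xi). \]

Under $\mathbb{P}$, the random tree $\mathcal{T}$ is a GW tree whose reproduction law $\nu_g$ has generating
function $g$ given by (3). According to Propositions 2.1 and 3.2 in [8], $(\mathcal{P}_\theta(\mathcal{T}), \theta \ge 0)$ is a
Markov process and $\mathcal{P}_\theta(\mathcal{T})$ is a GW tree whose reproduction law has generating function $g_\theta$,
with:

\[ g_\theta(r) = 1 + (1+\theta)\left[ g\!\left(\frac{r}{1+\theta}\right) - g\!\left(\frac{1}{1+\theta}\right) \right]. \]

Notice that:

\[ g_\theta(r) = r + \alpha\, \frac{(1-r+\theta)^{1/\alpha} - \theta^{1/\alpha}}{(1+\theta)^{(1/\alpha)-1}}. \tag{8} \]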
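The closed form (8) can be cross-checked against a direct thinning computation. The sketch below rests on our reading of the marking rule of Section 2.2, not on a statement quoted from [8]: a node with $k$ children keeps them with probability $(1+\theta)^{1-k}$ and is otherwise turned into a leaf, which gives the generating function $1 + (1+\theta)\,[g(r/(1+\theta)) - g(1/(1+\theta))]$. All function names are hypothetical.

```python
def g(r: float, alpha: float) -> float:
    """Generating function (3)."""
    return r + alpha * (1.0 - r) ** (1.0 / alpha)


def g_theta_closed(r: float, theta: float, alpha: float) -> float:
    """Right-hand side of (8)."""
    p = 1.0 / alpha
    return r + alpha * ((1.0 - r + theta) ** p - theta ** p) / (1.0 + theta) ** (p - 1.0)


def g_theta_thinned(r: float, theta: float, alpha: float) -> float:
    """Offspring generating function obtained by thinning: a node with k children
    is kept with probability (1 + theta) ** (1 - k), otherwise it becomes a leaf."""
    s = 1.0 + theta
    return 1.0 + s * (g(r / s, alpha) - g(1.0 / s, alpha))
```

Both functions agree up to rounding for $r \in [0,1]$ and $\theta \ge 0$; they reduce to $g$ at $\theta = 0$ and take the value 1 at $r = 1$, as a generating function should.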

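For every positive integer $n$, we set:

\[ \mathbb{P}_n(\cdot) = \mathbb{P}(\,\cdot \mid L(\mathcal{T}) = n). \]

Under $\mathbb{P}_n$, the distribution of the tree $\mathcal{T}$ is given by the following formula (see [16], Theorem
3.3.3, or [28]), for $\mathbf{t} \in \mathbb{T}_n$:

\[ \mathbb{P}_n(\mathcal{T} = \mathbf{t}) = n! \left( \prod_{v \in \mathcal{N}(\mathbf{t})} \frac{p_{k_v(\mathbf{t})}}{k_v(\mathbf{t})!} \right) \frac{\alpha^{n-1}\,\Gamma(1-\alpha)}{\Gamma(n-\alpha)}, \tag{9} \]

where $p_1 = 0$ and, for $k \ge 2$, $p_k = |(1-\gamma)(2-\gamma) \cdots (k-1-\gamma)|$.

Let $n \in \mathbb{N}^*$. Let $\mathcal{T}$ be a random tree distributed as $\mathbb{P}_n$. Conditionally on $\mathcal{T}$, we define
a uniform random labeling $U_1, \ldots, U_n$ of the leaves of $\mathcal{T}$, independently of the variables
$(\xi_u, u \in \mathcal{T})$. Recall the set of ancestors defined in (6) and the pruning procedure $\mathcal{P}_\theta$
introduced in Section 2.2. We define the equivalence relation $\mathcal{R}^{\mathrm{GW}}_\theta$ on $\{1, 2, \ldots, n\}$ by: $i\, \mathcal{R}^{\mathrm{GW}}_\theta\, j$
if $A_{U_i} \cap A_{U_j} \cap \mathcal{L}(\mathcal{P}_\theta(\mathcal{T}))$ is non empty, that is, $U_i$ and $U_j$ have a leaf of $\mathcal{P}_\theta(\mathcal{T})$ as common
ancestor. Then, for every $\theta \ge 0$, let $\Pi^{[n]}_{\mathrm{GW}}(\theta)$ be the set of equivalence classes of the equivalence
relation $\mathcal{R}^{\mathrm{GW}}_\theta$ on the first $n$ integers. Let $\Pi^{[n]}_{\mathrm{GW}} = (\Pi^{[n]}_{\mathrm{GW}}(k), k \in \mathbb{N})$ be the discrete process
associated with the continuous-time process $(\Pi^{[n]}_{\mathrm{GW}}(\theta), \theta \ge 0)$ until it reaches the absorbing state (which is the
trivial partition consisting of one block); afterward the discrete process remains constant.

We end this section with an elementary lemma which will be used in the proof of
Theorem 1.1.

Lemma 2.2. We have for $n \ge 2$:

\[ \mathbb{E}_n[k_\emptyset(\mathcal{T}) - 1] = \frac{1-\alpha}{\alpha}\, \frac{\Gamma(1-\alpha)\,\Gamma(n-1+\alpha)}{\Gamma(\alpha)\,\Gamma(n-\alpha)}. \tag{10} \]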

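Proof. We consider the generating function of $(k_\emptyset(\mathcal{T}), L(\mathcal{T}))$ under $\mathbb{P}$, that is $H(s,t) =
\mathbb{E}\big[s^{k_\emptyset(\mathcal{T})}\, t^{L(\mathcal{T})}\big]$. Using the branching property of GW trees, we have:

\[ H(s,t) = \mathbb{E}\Big[ s^{k_\emptyset(\mathcal{T})}\, \mathbb{E}\big[t^{L(\mathcal{T})}\big]^{k_\emptyset(\mathcal{T})}\, \mathbf{1}_{\{k_\emptyset(\mathcal{T}) \neq 0\}} \Big] + t\, \mathbb{P}(k_\emptyset(\mathcal{T}) = 0). \tag{11} \]

Notice that $g(s) = \mathbb{E}\big[s^{k_\emptyset(\mathcal{T})}\big] = H(s,1)$. We set $h(t) = H(1,t) = \mathbb{E}\big[t^{L(\mathcal{T})}\big]$ the generating
function of $L(\mathcal{T})$, so that (11) becomes:

\[ H(s,t) = g(s\,h(t)) - g(0)(1-t). \tag{12} \]

Taking $s = 1$ in (12), we get:

\[ g(h(t)) - h(t) = g(0)(1-t). \tag{13} \]

Using expression (3), we get:

\[ h(t) = 1 - (1-t)^{\alpha} \quad\text{and}\quad H(s,t) = s\,h(t) + \alpha\,(1 - s\,h(t))^{1/\alpha} - \alpha(1-t). \]

We deduce that:

\begin{align*}
\mathbb{E}\big[k_\emptyset(\mathcal{T})\, t^{L(\mathcal{T})}\big] = \frac{\partial H}{\partial s}(1,t) &= h(t) - h(t)\,(1-h(t))^{(1/\alpha)-1} \\
&= h(t) - \big[1 - (1-t)^{\alpha}\big](1-t)^{1-\alpha} \\
&= h(t) - (1-t)^{1-\alpha} + 1 - t.
\end{align*}

This gives:

\[ \mathbb{E}\big[(k_\emptyset(\mathcal{T}) - 1)\, t^{L(\mathcal{T})}\big] = -(1-t)^{1-\alpha} + 1 - t. \]

For $n \ge 2$, we get:

\begin{align*}
\mathbb{E}\big[(k_\emptyset(\mathcal{T}) - 1)\,\mathbf{1}_{\{L(\mathcal{T}) = n\}}\big] &= \frac{1}{n!}\, \frac{d^n}{dt^n}\, \mathbb{E}\big[(k_\emptyset(\mathcal{T}) - 1)\, t^{L(\mathcal{T})}\big]\Big|_{t=0} \\
&= \frac{1}{n!}\,(1-\alpha) \prod_{k=0}^{n-2} (\alpha + k) \\
&= \frac{1}{n!}\,(1-\alpha)\, \frac{\Gamma(n-1+\alpha)}{\Gamma(\alpha)}.
\end{align*}

We also get for $n \ge 2$:

\[ \mathbb{P}(L(\mathcal{T}) = n) = \frac{1}{n!}\, h^{(n)}(0) = \frac{1}{n!}\, \alpha \prod_{k=1}^{n-1} (k - \alpha) = \frac{1}{n!}\, \alpha\, \frac{\Gamma(n-\alpha)}{\Gamma(1-\alpha)}. \]

We deduce that:

\[ \mathbb{E}_n[k_\emptyset(\mathcal{T}) - 1] = \frac{\mathbb{E}\big[(k_\emptyset(\mathcal{T}) - 1)\,\mathbf{1}_{\{L(\mathcal{T}) = n\}}\big]}{\mathbb{P}(L(\mathcal{T}) = n)} = \frac{1-\alpha}{\alpha}\, \frac{\Gamma(1-\alpha)\,\Gamma(n-1+\alpha)}{\Gamma(\alpha)\,\Gamma(n-\alpha)}. \]

□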
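The closed forms appearing in the proof of Lemma 2.2 lend themselves to a quick numerical sanity check. The sketch below assumes $h(t) = 1 - (1-t)^{\alpha}$, $\mathbb{P}(L(\mathcal{T}) = n) = \alpha\,\Gamma(n-\alpha)/(n!\,\Gamma(1-\alpha))$ and the right-hand side of (10) as written there; the function names are hypothetical.

```python
from math import exp, lgamma


def g(r: float, alpha: float) -> float:
    """Generating function (3)."""
    return r + alpha * (1.0 - r) ** (1.0 / alpha)


def h(t: float, alpha: float) -> float:
    """Generating function of the number of leaves L(T)."""
    return 1.0 - (1.0 - t) ** alpha


def prob_leaves(n: int, alpha: float) -> float:
    """P(L(T) = n) = alpha * Gamma(n - alpha) / (n! * Gamma(1 - alpha))."""
    return alpha * exp(lgamma(n - alpha) - lgamma(n + 1) - lgamma(1 - alpha))


def mean_root_degree_minus_one(n: int, alpha: float) -> float:
    """Right-hand side of (10), for n >= 2."""
    return (1 - alpha) / alpha * exp(
        lgamma(1 - alpha) + lgamma(n - 1 + alpha) - lgamma(alpha) - lgamma(n - alpha)
    )
```

Three consistency checks follow from the text: $h$ solves (13) with $g(0) = \alpha$; the series $\sum_n \mathbb{P}(L(\mathcal{T}) = n)\,t^n$ sums to $h(t)$; and (10) equals 1 when $n = 2$ (the root then has exactly two children) as well as for every $n$ when $\alpha = 1/2$ (binary trees).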
3. Proof of Theorem 1.1

Let $\alpha \in [1/2, 1)$ and $\Lambda$ given by (4). Notice that the probability that the first coalescence
event for $\Pi^{[n]}$ corresponds to the collision of $k$ given blocks is $\lambda_{n,k}/\lambda_n$, with $\lambda_{n,k}$ and $\lambda_n$ given
respectively by (1) and (2).

Theorem 1.1 is a direct consequence of Lemma 3.3, which states that the probability that
the first coalescence event for $\Pi^{[n]}_{\mathrm{GW}}$ corresponds to the collision of $k$ given blocks is $\lambda_{n,k}/\lambda_n$,
and of Lemma 3.4, which states that after the first coalescence event, the law of the pruned
tree under $\mathbb{P}_n$ conditionally given that it has $k$ leaves is exactly $\mathbb{P}_k$.

The proof of Lemma 3.3 (resp. 3.4) is given in Section 3.1 (resp. 3.2).



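3.1. Computation of the coalescence rates. We first give an intermediate lemma. For
$\alpha \in (0, 1)$ and $\lambda > \alpha - 1$, we set:

\[ \phi_{1+\alpha,1-\alpha}(\lambda) = \int_0^1 \big(1 - (1-x)^{\lambda}\big)\, x^{\alpha-2} (1-x)^{-\alpha}\, dx. \tag{14} \]

Lemma 3.1. For $\alpha \in (0, 1)$ and $\lambda > \alpha - 1$, we have:

\[ \phi_{1+\alpha,1-\alpha}(\lambda) = \lambda\, \frac{\Gamma(\alpha)\,\Gamma(\lambda+1-\alpha)}{(1-\alpha)\,\Gamma(\lambda+1)}. \tag{15} \]

Notice that for $\lambda > 0$, (15) reduces to:

\[ \phi_{1+\alpha,1-\alpha}(\lambda) = \frac{\Gamma(\alpha)\,\Gamma(\lambda+1-\alpha)}{(1-\alpha)\,\Gamma(\lambda)}. \tag{16} \]

Proof. We set:

\[ I = \int_0^1 \big((1-u)^{-\alpha} - 1\big)\, u^{\alpha-2}\, du. \]

Notice that $I$ is finite and $\phi_{1+\alpha,1-\alpha}(\alpha) = I$. For $\lambda > \alpha$, using an integration by parts, we
have:

\begin{align*}
\phi_{1+\alpha,1-\alpha}(\lambda) &= \int_0^1 \big(1 - (1-x)^{\lambda}\big)\, x^{\alpha-2} (1-x)^{-\alpha}\, dx \\
&= \int_0^1 \big((1-x)^{-\alpha} - 1\big)\, x^{\alpha-2}\, dx + \int_0^1 \big(1 - (1-x)^{\lambda-\alpha}\big)\, x^{\alpha-2}\, dx \\
&= I - \frac{1}{1-\alpha} + \frac{\lambda-\alpha}{1-\alpha} \int_0^1 (1-x)^{\lambda-\alpha-1}\, x^{\alpha-1}\, dx \\
&= I - \frac{1}{1-\alpha} + \frac{\Gamma(\alpha)\,\Gamma(\lambda+1-\alpha)}{(1-\alpha)\,\Gamma(\lambda)}.
\end{align*}

We now compute $I$. For $\lambda = 1$, we also have:

\[ \phi_{1+\alpha,1-\alpha}(1) = \int_0^1 x^{\alpha-1} (1-x)^{-\alpha}\, dx = \Gamma(\alpha)\,\Gamma(1-\alpha). \]

We deduce that:

\[ I = \Gamma(\alpha)\,\Gamma(1-\alpha) + \frac{1}{1-\alpha} - \frac{\Gamma(\alpha)\,\Gamma(2-\alpha)}{(1-\alpha)\,\Gamma(1)}. \]

This readily implies that $I = 1/(1-\alpha)$ and thus (15) holds for $\lambda > \alpha$. Then use the fact that the
right-hand sides of (14) and (15) are analytic for $\lambda > \alpha - 1$ to get that (15) also holds for
$\lambda > \alpha - 1$. □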
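Recall that $\lambda_{n,k}$ and $\lambda_n$ are given respectively by (1) and (2), for $\Lambda$ given by (4).

Lemma 3.2. Let $\alpha \in [1/2, 1)$. We have for $2 \le k \le n$:

\[ \frac{\lambda_{n,k}}{\lambda_n} = \frac{1-\alpha}{\Gamma(\alpha+1)}\, \frac{\Gamma(k+\alpha-1)\,\Gamma(n-k-\alpha+1)}{\Gamma(n-\alpha)}\, \frac{1}{n-1}. \tag{17} \]

Proof. We have

\begin{align*}
\lambda_{n,k} &= \int_0^1 u^{k-2} (1-u)^{n-k}\, \Lambda(du) = \int_0^1 u^{k-2+\alpha} (1-u)^{n-k-\alpha}\, du \\
&= \beta(k+\alpha-1,\, n-k-\alpha+1) = \frac{\Gamma(k+\alpha-1)\,\Gamma(n-k-\alpha+1)}{\Gamma(n)}
\end{align*}

and

\[ \lambda_n = \sum_{k=2}^{n} \binom{n}{k} \lambda_{n,k} = \int_0^1 \big(1 - (1-u)^n - n u (1-u)^{n-1}\big)\, u^{-2}\, \Lambda(du). \]

Then using notation (14) and (16), we deduce that:

\begin{align*}
\lambda_n &= \phi_{1+\alpha,1-\alpha}(n) - n \int_0^1 u^{\alpha-1} (1-u)^{n-1-\alpha}\, du \\
&= \frac{\Gamma(\alpha)\,\Gamma(n+1-\alpha)}{(1-\alpha)\,\Gamma(n)} - n\, \frac{\Gamma(\alpha)\,\Gamma(n-\alpha)}{\Gamma(n)} \\
&= \left( \frac{n-\alpha}{1-\alpha} - n \right) \frac{\Gamma(\alpha)\,\Gamma(n-\alpha)}{\Gamma(n)} \\
&= (n-1)\, \frac{\alpha}{1-\alpha}\, \frac{\Gamma(\alpha)\,\Gamma(n-\alpha)}{\Gamma(n)}.
\end{align*}

The expression obtained for $\lambda_{n,k}$ then gives the result. □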
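The closed forms obtained in the proof of Lemma 3.2 can be compared with the defining sum (2). The sketch below assumes $\lambda_{n,k} = \Gamma(k+\alpha-1)\Gamma(n-k-\alpha+1)/\Gamma(n)$ and $\lambda_n = (n-1)\frac{\alpha}{1-\alpha}\Gamma(\alpha)\Gamma(n-\alpha)/\Gamma(n)$, as derived above; the names `lam`, `lam_total_closed` and `ratio_closed` are hypothetical.

```python
from math import comb, exp, lgamma


def lam(n: int, k: int, alpha: float) -> float:
    """lambda_{n,k} = Gamma(k+alpha-1) * Gamma(n-k-alpha+1) / Gamma(n)."""
    return exp(lgamma(k + alpha - 1) + lgamma(n - k - alpha + 1) - lgamma(n))


def lam_total_closed(n: int, alpha: float) -> float:
    """lambda_n = (n-1) * alpha/(1-alpha) * Gamma(alpha) * Gamma(n-alpha) / Gamma(n)."""
    return (n - 1) * alpha / (1 - alpha) * exp(
        lgamma(alpha) + lgamma(n - alpha) - lgamma(n)
    )


def ratio_closed(n: int, k: int, alpha: float) -> float:
    """Right-hand side of (17)."""
    return (1 - alpha) / (n - 1) * exp(
        lgamma(k + alpha - 1)
        + lgamma(n - k - alpha + 1)
        - lgamma(alpha + 1)
        - lgamma(n - alpha)
    )
```

Summing `comb(n, k) * lam(n, k, alpha)` over $k = 2, \ldots, n$ should reproduce `lam_total_closed(n, alpha)`, and the quantities `comb(n, k) * ratio_closed(n, k, alpha)` should add up to 1, since they form the law of the size of the first merger.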

If $\mathbf{t}_1$ and $\mathbf{t}_2$ are two discrete trees and $u \in \mathcal{L}(\mathbf{t}_1)$ is a leaf of $\mathbf{t}_1$, we shall denote by $\mathbf{t}_1 \circledast_u \mathbf{t}_2$
the tree obtained by grafting the tree $\mathbf{t}_2$ on the leaf $u$ of $\mathbf{t}_1$, that is:

\[ \mathbf{t}_1 \circledast_u \mathbf{t}_2 = \{v,\ v \in \mathbf{t}_1\} \cup \{uv,\ v \in \mathbf{t}_2\}. \tag{18} \]

Lemma 3.3. Let $\alpha \in [1/2, 1)$. The probability under $\mathbb{P}_n$ that the first coalescence event in
$\Pi^{[n]}_{\mathrm{GW}}$ is the coalescence of $k$ given integers into one block is $\lambda_{n,k}/\lambda_n$.

Proof. Let $A_k$ be the event that the first coalescence event corresponds to the first $k$ integers
merging together. By exchangeability, the lemma is proved as soon as we check that $\mathbb{P}_n(A_k) =
\lambda_{n,k}/\lambda_n$.

The event $A_k$ is realized if and only if:

• The initial tree $\mathcal{T}$ is of the form $\mathbf{t}_1 \circledast_u \mathbf{t}_2$ for some $\mathbf{t}_2 \in \mathbb{T}_k$, $\mathbf{t}_1 \in \mathbb{T}_{n-k+1}$ and
$u \in \mathcal{L}(\mathbf{t}_1)$.

• The leaves of $\mathbf{t}_2$ are labeled from 1 to $k$ (and therefore, the leaves of $\mathbf{t}_1$ except $u$ are
labeled from $k+1$ to $n$). This occurs with probability $k!\,(n-k)!/n!$.

• The first chosen node of $\mathbf{t}_1 \circledast_u \mathbf{t}_2$ is $u$. This occurs, according to Lemma 2.1, with
probability $\dfrac{k_\emptyset(\mathbf{t}_2) - 1}{n-1}$.
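Thus, using (9) for the probability of having a given tree, we have:

\begin{align*}
\mathbb{P}_n(A_k) &= \sum_{\substack{\mathbf{t}_1 \in \mathbb{T}_{n-k+1},\ \mathbf{t}_2 \in \mathbb{T}_k \\ u \in \mathcal{L}(\mathbf{t}_1)}} \mathbb{P}_n(\mathcal{T} = \mathbf{t}_1 \circledast_u \mathbf{t}_2)\, \frac{k!\,(n-k)!}{n!}\, \frac{k_\emptyset(\mathbf{t}_2) - 1}{n-1} \\
&= \sum_{\substack{\mathbf{t}_1 \in \mathbb{T}_{n-k+1},\ \mathbf{t}_2 \in \mathbb{T}_k \\ u \in \mathcal{L}(\mathbf{t}_1)}} \left( n! \prod_{v \in \mathcal{N}(\mathbf{t}_1 \circledast_u \mathbf{t}_2)} \frac{p_{k_v(\mathbf{t}_1 \circledast_u \mathbf{t}_2)}}{k_v(\mathbf{t}_1 \circledast_u \mathbf{t}_2)!} \right) \frac{\alpha^{n-1}\,\Gamma(1-\alpha)}{\Gamma(n-\alpha)}\, \frac{k!\,(n-k)!}{n!}\, \frac{k_\emptyset(\mathbf{t}_2) - 1}{n-1} \\
&= (n-k+1) \sum_{\substack{\mathbf{t}_1 \in \mathbb{T}_{n-k+1} \\ \mathbf{t}_2 \in \mathbb{T}_k}} \frac{n!}{(n-k+1)!\,k!}\, \mathbb{P}_{n-k+1}(\mathcal{T} = \mathbf{t}_1)\, \mathbb{P}_k(\mathcal{T} = \mathbf{t}_2) \\
&\qquad \times \frac{\alpha^{n-1}\,\Gamma(1-\alpha)}{\Gamma(n-\alpha)}\, \frac{\Gamma(n-k-\alpha+1)}{\alpha^{n-k}\,\Gamma(1-\alpha)}\, \frac{\Gamma(k-\alpha)}{\alpha^{k-1}\,\Gamma(1-\alpha)}\, \frac{k!\,(n-k)!}{n!}\, \frac{k_\emptyset(\mathbf{t}_2) - 1}{n-1} \\
&= \frac{\Gamma(n-k-\alpha+1)\,\Gamma(k-\alpha)}{\Gamma(n-\alpha)\,\Gamma(1-\alpha)}\, \frac{1}{n-1}\, \mathbb{E}_k[k_\emptyset(\mathcal{T}) - 1].
\end{align*}

We then use Lemma 2.2 and Lemma 3.2 to conclude. □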

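3.2. Law of the tree after the first coalescence event. Let $S$ be the time of the first
coalescence event and recall that $\mathcal{P}_S(\mathcal{T})$ denotes the pruned tree at the first coalescence event.

Lemma 3.4. Let $\mathbf{t} \in \mathbb{T}_k$. We have:

\[ \mathbb{P}_n\big(\mathcal{P}_S(\mathcal{T}) = \mathbf{t} \mid L(\mathcal{P}_S(\mathcal{T})) = k\big) = \mathbb{P}_k(\mathcal{T} = \mathbf{t}). \tag{19} \]

Proof. Let $\mathbf{t} \in \mathbb{T}_k$. We obtain $\mathbf{t}$ just after the first coalescence event if $\mathcal{T}$ is of the form $\mathbf{t} \circledast_u \mathbf{s}$
for some $\mathbf{s} \in \mathbb{T}_{n-k+1}$, $u \in \mathcal{L}(\mathbf{t})$, and $u$ is the first chosen internal node. This gives:

\begin{align*}
\mathbb{P}_n(\mathcal{P}_S(\mathcal{T}) = \mathbf{t}) &= \sum_{\substack{u \in \mathcal{L}(\mathbf{t}) \\ \mathbf{s} \in \mathbb{T}_{n-k+1}}} \mathbb{P}_n(\mathcal{T} = \mathbf{t} \circledast_u \mathbf{s})\, \frac{k_\emptyset(\mathbf{s}) - 1}{n-1} \\
&= \sum_{\substack{u \in \mathcal{L}(\mathbf{t}) \\ \mathbf{s} \in \mathbb{T}_{n-k+1}}} n! \left( \prod_{v \in \mathcal{N}(\mathbf{t})} \frac{p_{k_v(\mathbf{t})}}{k_v(\mathbf{t})!} \right) \left( \prod_{v \in \mathcal{N}(\mathbf{s})} \frac{p_{k_v(\mathbf{s})}}{k_v(\mathbf{s})!} \right) \frac{\alpha^{n-1}\,\Gamma(1-\alpha)}{\Gamma(n-\alpha)}\, \frac{k_\emptyset(\mathbf{s}) - 1}{n-1} \\
&= k \sum_{\mathbf{s} \in \mathbb{T}_{n-k+1}} \frac{n!}{k!\,(n-k+1)!}\, \mathbb{P}_k(\mathcal{T} = \mathbf{t})\, \mathbb{P}_{n-k+1}(\mathcal{T} = \mathbf{s}) \\
&\qquad \times \frac{\alpha^{n-1}\,\Gamma(1-\alpha)}{\Gamma(n-\alpha)}\, \frac{\Gamma(k-\alpha)}{\alpha^{k-1}\,\Gamma(1-\alpha)}\, \frac{\Gamma(n-k+1-\alpha)}{\alpha^{n-k}\,\Gamma(1-\alpha)}\, \frac{k_\emptyset(\mathbf{s}) - 1}{n-1} \\
&= \frac{n!}{(k-1)!\,(n-k+1)!}\, \frac{\Gamma(n-k+1-\alpha)\,\Gamma(k-\alpha)}{\Gamma(n-\alpha)\,\Gamma(1-\alpha)}\, \frac{1}{n-1}\, \mathbb{E}_{n-k+1}[k_\emptyset(\mathcal{T}) - 1]\; \mathbb{P}_k(\mathcal{T} = \mathbf{t}).
\end{align*}

As the term in front of $\mathbb{P}_k(\mathcal{T} = \mathbf{t})$ does not depend on $\mathbf{t}$, it has to be equal to $\mathbb{P}_n(L(\mathcal{P}_S(\mathcal{T})) =
k)$ and therefore (19) holds. □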
4. Pruning of rooted real trees and proof of Theorem 1.2

The aim of this section is to use the pruning procedure for Lévy trees developed in [1]
to give a consistent representation of the family of coalescent processes $(\Pi^{[n]}_{\text{Lévy}}, n \in \mathbb{N}^*)$, see
Corollary 4.4, and thus deduce Theorem 1.2.

4.1. The CRT framework. 

4.1.1. Real trees. Real trees were first introduced in the field of geometric group theory
(see for instance [2]) and were later used for defining continuum random trees (the framework
first appeared in [18]). A real tree is a complete metric space $(\mathcal{T}, d)$ satisfying the following
two properties for every $x, y \in \mathcal{T}$:

• (unique geodesic) There is a unique isometric map $f_{x,y}$ from $[0, d(x,y)]$ into $\mathcal{T}$ such
that $f_{x,y}(0) = x$ and $f_{x,y}(d(x,y)) = y$.

• (no loop) If $\varphi$ is a continuous injective map from $[0, 1]$ into $\mathcal{T}$ such that $\varphi(0) = x$ and
$\varphi(1) = y$, then

\[ \varphi([0,1]) = f_{x,y}([0, d(x,y)]). \]

A rooted real tree is a real tree with a distinguished vertex denoted $\emptyset$ and called the root.

For every $x, y \in \mathcal{T}$, we denote by $[\![x, y]\!]$ the range of the map $f_{x,y}$ (i.e. the only path in the
tree that links $x$ to $y$) and we set $[\![x, y[\![\, = [\![x, y]\!] \setminus \{y\}$.

If $\mathcal{T}$ is a rooted real tree, for $x \in \mathcal{T}$, we define the degree of $x$, denoted by $n_x$, as the number
of connected components of $\mathcal{T} \setminus \{x\}$. The set of leaves of $\mathcal{T}$ is $\mathcal{L}(\mathcal{T}) = \{x \in \mathcal{T} \setminus \{\emptyset\};\ n_x = 1\}$. If
$n_x \ge 3$, we say that $x$ is a branching point of $\mathcal{T}$. We denote by $\mathrm{Br}(\mathcal{T})$ the set of branching
points of $\mathcal{T}$. The height of $\mathcal{T}$ is $H_{\max}(\mathcal{T}) = \sup\{d(\emptyset, x);\ x \in \mathcal{T}\}$. Let $(x_i, i \in I)$ be a family
of elements of $\mathcal{T}$; we define their most recent common ancestor, denoted by $\mathrm{MRCA}(x_i, i \in I)$,
as the element $x$ of $\mathcal{T}$ such that $[\![\emptyset, x]\!] = \bigcap_{i \in I} [\![\emptyset, x_i]\!]$.

A weighted rooted real tree $(\mathcal{T}, d, \mathbf{m})$ is a rooted real tree $(\mathcal{T}, d)$ endowed with a mass
measure $\mathbf{m}$ on $\mathcal{T}$.

4.1.2. Stable Lévy tree. Set $\psi(\lambda) = \lambda^{\gamma}$ with $\gamma \in (1,2)$. We refer to [17] and [9] for the
existence of a measure $\mathbb{N}[d\mathcal{T}]$ on the set of weighted locally compact rooted real trees such
that $\mathcal{T}$ is under $\mathbb{N}[d\mathcal{T}]$ a Lévy tree associated with the branching mechanism $\psi$. For the
Lévy tree $(\mathcal{T}, d, \mathbf{m})$, $\mathbb{N}[d\mathcal{T}]$-a.e., the mass measure has support $\mathcal{L}(\mathcal{T})$ and has no atom.
Furthermore, $\mathbb{N}[d\mathcal{T}]$-a.e., all the branching points of the tree are of infinite degree. Following
[17], there exists a local time process $(\ell^a, a \ge 0)$ with values in finite measures on $\mathcal{T}$, which
is càdlàg for the weak topology on finite measures on $\mathcal{T}$ and such that $\mathbb{N}[d\mathcal{T}]$-a.e.:

\[ \mathbf{m}(dx) = \int_0^{\infty} \ell^a(dx)\, da, \]

$\ell^0 = 0$, $\inf\{a > 0;\ \ell^a = 0\} = \sup\{a \ge 0;\ \ell^a \neq 0\} = H_{\max}(\mathcal{T})$ and for every fixed $a \ge 0$,
$\mathbb{N}[d\mathcal{T}]$-a.e. the measure $\ell^a$ is supported on $\{x \in \mathcal{T};\ d(\emptyset, x) = a\}$ and the real valued process
$(\langle \ell^a, 1 \rangle, a \ge 0)$ is distributed as a continuous state branching process (CSBP) with branching
mechanism $\psi$ under its canonical measure. In particular, as the total size of a critical CSBP
is finite, we get that $\mathbb{N}$-a.e. $\sigma = \mathbf{m}(\mathcal{T})$ is finite.

The set $\{d(\emptyset, x),\ x \in \mathrm{Br}(\mathcal{T})\}$ coincides $\mathbb{N}$-a.e. with the set of discontinuity times of the
mapping $a \mapsto \ell^a$. Moreover, $\mathbb{N}$-a.e., for every such discontinuity time $b$, there is a unique
$x \in \mathrm{Br}(\mathcal{T})$ such that $d(\emptyset, x) = b$ and $\Delta_x > 0$, such that:

\[ \ell^b = \ell^{b-} + \Delta_x \delta_x, \]

where $\Delta_x > 0$ is called the mass of the node $x$. Intuitively $\Delta_x$ represents the mass of the
progeny of $x$.

The scaling property of the stable Lévy tree implies that there exists a well defined probability
measure $\mathbb{N}^{(1)}$, defined as the measure $\mathbb{N}$ conditioned on $\{\sigma = 1\}$. The probability measure $\mathbb{N}^{(1)}$ is
also referred to as the normalized excursion measure for Lévy trees.

4.2. The partition-valued process. Set $\psi(\lambda) = \lambda^{\gamma}$ with $\gamma \in (1,2)$.

4.2.1. Pruning of the stable Lévy tree. We consider the pruning procedure introduced in [1]
(this procedure is defined when there is no Brownian part in the Lévy process with index
given by the branching mechanism $\psi$). Under $\mathbb{N}$ or $\mathbb{N}^{(1)}$, conditionally given $\mathcal{T}$, we consider a
family $(E_x, x \in \mathrm{Br}(\mathcal{T}))$ of independent real random variables such that the random variable
$E_x$ is exponentially distributed with parameter $\Delta_x$. This random variable represents the time
at which the branching point $x$ is marked. For every $\theta \ge 0$, we set

\[ \mathcal{T}_\theta = \{x \in \mathcal{T},\ \forall y \in [\![\emptyset, x[\![,\ E_y > \theta\}. \]

The set $\mathcal{T}_\theta$ is still a real tree which represents the tree $\mathcal{T}$ pruned at time $\theta$: we cut $\mathcal{T}$ at the
points that are marked before time $\theta$ and keep the connected component of the tree that
contains the root. We set $\mathcal{T}_0 = \mathcal{T}$. By [1], Theorem 1.5, the tree $\mathcal{T}_\theta$ is distributed under $\mathbb{N}$ as
a Lévy tree with branching mechanism $\psi_\theta$ defined by:

\[ \psi_\theta(\lambda) = \psi(\lambda + \theta) - \psi(\theta). \]

Moreover, by [2], the process $(\mathcal{T}_\theta, \theta \ge 0)$ is under $\mathbb{N}$ a Markov process.

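4.2.2. Definition of the partition-valued process. Under $\mathbb{N}$ or $\mathbb{N}^{(1)}$, conditionally on $\mathcal{T}$, let
$(F_i, i \in \mathbb{N}^*)$ be independent random variables on $\mathcal{T}$ distributed according to the probability
mass measure $\mathbf{m}/\mathbf{m}(\mathcal{T})$, and independent of the marks $(E_x, x \in \mathrm{Br}(\mathcal{T}))$. Notice that $\mathbb{N}$-a.e.
or $\mathbb{N}^{(1)}$-a.s. $(F_i, i \in \mathbb{N}^*)$ are leaves of $\mathcal{T}$. For $\theta \ge 0$, we define the equivalence relation $\mathcal{R}^{\text{Lévy}}_\theta$
on $\mathbb{N}^*$ by: $i\, \mathcal{R}^{\text{Lévy}}_\theta\, j$ if $[\![\emptyset, F_i]\!] \cap [\![\emptyset, F_j]\!] \cap \mathcal{L}(\mathcal{T}_\theta)$ is non empty, that is, $F_i$ and $F_j$ have a leaf of
$\mathcal{T}_\theta$ as common ancestor. This is very close to the definition of the equivalence relation $\mathcal{R}^{\mathrm{GW}}_\theta$
defined in Section 2.3. We denote by $\Pi_{\text{Lévy}}(\theta)$ the partition of $\mathbb{N}^*$ formed by the equivalence
classes of $\mathcal{R}^{\text{Lévy}}_\theta$ and set $\Pi_{\text{Lévy}} = (\Pi_{\text{Lévy}}(\theta), \theta \ge 0)$.

4.3. Lévy sub-trees.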

4.3.1. Skeleton of finite real tree. Let $t$ be a real tree with finite height and a finite number
of leaves, such that the leaves $(f_i, i \in I(t))$ are indexed by a totally ordered set $I(t)$. We
define the skeleton $\mathbf{t}$ of the tree $t$ as the discrete tree (belonging to $\mathbb{T}$) where we forget the
edge lengths. As the trees in $\mathbb{T}$ are ordered, we must be a bit more rigorous for the definition
of $\mathbf{t}$.

The skeleton $\mathbf{t}$ of the real tree with ordered leaves $(t, (f_i, i \in I(t)))$ is defined recursively
as follows. We define $k_\emptyset(\mathbf{t})$ as the degree of $\mathrm{MRCA}(f_i, i \in I(t))$, the ancestor of all the
leaves of $t$. If $k_\emptyset(\mathbf{t}) = 0$, then $\mathbf{t}$ is reduced to the root $\emptyset$. In this case $t$ has one leaf, let $\ell$ be its
label, and the discrete tree $\mathbf{t}$ has thus one leaf, to which we give the label $\ell$. If $k_\emptyset(\mathbf{t}) > 0$,
then we consider the $k_\emptyset(\mathbf{t})$ connected components of $t \setminus \{\mathrm{MRCA}(f_i, i \in I(t))\}$ that do not
contain the root and label them from 1 to $k_\emptyset(\mathbf{t})$ according to the lowest label of the leaves
of $t$ which belong to them. This gives an ordered family $(t_k, k \in \{1, \ldots, k_\emptyset(\mathbf{t})\})$ of real
trees, and we let $\mathrm{MRCA}(f_i, i \in I(t))$ be the root of each one. For $k \in \{1, \ldots, k_\emptyset(\mathbf{t})\}$, let
$I(t_k) = \{i \in I(t);\ f_i \in t_k\}$ be the labels of the leaves of $t_k$, and the discrete tree $\mathbf{t}_k$ is the
skeleton of $(t_k, (f_i, i \in I(t_k)))$.

Notice that $\mathbf{t}$ is finite, $k_u(\mathbf{t}) \neq 1$ for all $u \in \mathbf{t}$, and $t$ and $\mathbf{t}$ have the same number of leaves.
In the previous construction, to a leaf $f_i$ of $t$ with label $i$ corresponds a unique leaf $e_i$ of $\mathbf{t}$
with label $i$. For $u \in \mathbf{t}$, we define $\mathbf{t}_u$ the sub-tree of $\mathbf{t}$ attached to the node $u$, i.e.

\[ \mathbf{t}_u = \{v \in \mathcal{U},\ uv \in \mathbf{t}\}, \]

and let $I_u = \{i;\ e_i \in \mathbf{t}_u\}$. Define $t_u$ as $s_u = t \setminus \bigcup_{i \notin I_u} [\![\emptyset, f_i]\!]$, to which we add the root
$\emptyset_u = \bar{s}_u \setminus s_u$, and $I(t_u) = \{i;\ e_i \in \mathbf{t}_u\}$. Notice that by construction $\mathbf{t}_u$ is the skeleton of
$(t_u, (f_i, i \in I(t_u)))$. We say that the nodes $u \in \mathbf{t}$ are the individuals of $t$, and define their lifetime as
the length $h_u$ of the geodesic $B(u) = [\![\emptyset_u, \mathrm{MRCA}(f_i, i \in I(t_u))]\!]$. We say that the node
in $t$ corresponding to $u \in \mathbf{t}$ is $C(u) = \mathrm{MRCA}(f_i, i \in I(t_u))$.

Notice that it is easy to reconstruct $t$ from $\mathbf{t}$ and the family of lifetimes $(h_u, u \in \mathbf{t})$.

4.3.2. Coalescence of Lévy tree and GW tree. Let $M$ be, under $\mathbf{N}$ or $\mathbf{N}^{(1)}$, conditionally on
$\mathcal{T}$, a Poisson random variable with finite mean $\sigma = m(\mathcal{T})$. On $\{M \geq 1\}$, let $\hat{\mathcal{T}}$ be the real
sub-tree of $\mathcal{T}$ generated by the root and $(F_i, 1 \leq i \leq M)$:

\[ \hat{\mathcal{T}} = \bigcup_{1 \leq i \leq M} [\![\emptyset, F_i]\!]. \]

Since $m$ has support $\mathcal{L}(\mathcal{T})$ and has no atom, we deduce that $(F_i, 1 \leq i \leq M)$ are distinct
and are the leaves of $\hat{\mathcal{T}}$.

We denote by $\hat{T}$ the skeleton of $\hat{\mathcal{T}}$ with the labeled leaves $(F_i, 1 \leq i \leq M)$. According to
[16], Theorem 3.2.1, the tree $\hat{\mathcal{T}}$ is distributed under $\mathbf{N}[\,\cdot \mid M \geq 1]$ as a continuous GW tree
(i.e. a GW tree with edge-lengths) such that

• The discrete tree $\hat{T}$ is a GW tree with reproduction law characterized by its generating
function $g$ defined by (3) with $\alpha = 1/\gamma$.

• Lifetimes of individuals $(h_u, u \in \hat{T})$ are independent random variables with exponential
distribution with parameter $\gamma$.

Remark 4.1. Using the scaling property of the Lévy tree, we have that the distributions of
$\hat{T}$ under $\mathbf{N}[\,\cdot \mid M = n]$ and under $\mathbf{N}^{(1)}[\,\cdot \mid M = n]$ are the same.

We now consider the marks that define the pruned tree $\mathcal{T}_\theta$ and we denote by $\hat{\mathcal{T}}_\theta$ the tree
$\hat{\mathcal{T}}$ pruned on the same marks; in other words, we set

\[ \hat{\mathcal{T}}_\theta = \hat{\mathcal{T}} \cap \mathcal{T}_\theta. \]

Let $\Pi^{[n]}_{\text{Lévy}}$ be the restriction of $\Pi_{\text{Lévy}}$ to the $n$ first integers. By construction, if $A_\theta$ is an element
of $\Pi^{[n]}_{\text{Lévy}}(\theta)$, then there exists a leaf $x$ of $\hat{\mathcal{T}}_\theta$ such that $x$ belongs to the sub-tree $\bigcup_{i \in A_\theta} [\![\emptyset, F_i]\!]$, and
$x$ is the only leaf of $\hat{\mathcal{T}}_\theta$ with this property. We set $A_\theta$ for the label of $x$, and we consider the
order of the elements of $\Pi^{[n]}_{\text{Lévy}}(\theta)$ given by the order of their smallest integer. We set $I_\theta = I(\hat{\mathcal{T}}_\theta)$
for the labels of the leaves of $\hat{\mathcal{T}}_\theta$ and $(F^\theta_i, i \in I_\theta)$ for the leaves of $\hat{\mathcal{T}}_\theta$.

We denote by $\hat{T}_\theta$ the skeleton of $\hat{\mathcal{T}}_\theta$ with the labeled leaves $(F^\theta_i, i \in I_\theta)$. According to
[5], Proposition 4.1, the tree $\hat{\mathcal{T}}_\theta$ is distributed under $\mathbf{N}[\,\cdot \mid M \geq 1]$ as a continuous GW tree
such that

• $\hat{T}_\theta$ is a GW tree with offspring distribution characterized by its generating function
$g_\theta$ given in (8) with $\alpha = 1/\gamma$.

• The lifetimes of individuals $(h_u, u \in \hat{T}_\theta)$ are independent random variables with exponential
distribution with parameter $\psi'_\theta(1) = \gamma (1 + \theta)^{\gamma - 1}$.

Proposition 4.2. The process $(\hat{T}_\theta, \theta \geq 0)$ is distributed under $\mathbf{N}[\,\cdot \mid M \geq 1]$ as the process
$(\mathcal{P}_\theta(T), \theta \geq 0)$ under $\mathbb{P}$.

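Proof. Let $\theta \geq 0$. Theorem 6.1 of [8] describes how $\hat{\mathcal{T}}_\theta$ is obtained from $\hat{\mathcal{T}}$:

• A branching point $x$ of $\hat{\mathcal{T}}$ with $k_x = k_x(\hat{\mathcal{T}})$ children is marked at time $\tau_x$ with
distribution given by:

\[ \mathbf{N}[\tau_x > \theta \mid \hat{\mathcal{T}}] = \frac{\psi^{(k_x)}(1 + \theta)}{\psi^{(k_x)}(1)} = \left( \frac{1}{1 + \theta} \right)^{k_x - \gamma}. \]

• A branch $B$ of length $h$ is marked at time $\tau_B$ with distribution given by:

\[ \mathbf{N}[\tau_B > \theta \mid \hat{\mathcal{T}}] = \exp\left( -h \int_0^\theta \psi''(1 + q)\, dq \right) = e^{-(\psi'(1 + \theta) - \psi'(1)) h}. \]

Then the tree $\hat{\mathcal{T}}$ is cut according to the marks present at time $\theta$ and the tree $\hat{\mathcal{T}}_\theta$ is the
connected component that contains the root. Therefore, the tree $\hat{T}_\theta$ is obtained from the tree
$\hat{T}$ by a pruning at node. A node $u \in \hat{T}$ is marked if the corresponding node $C(u) \in \hat{\mathcal{T}}$ is
marked at time $\theta$ in the previous procedure or the branch $B(u)$ with length $h_u$ is marked.
So the node $u$ of $\hat{T}$ is marked at time $\zeta_u = \tau_{C(u)} \wedge \tau_{B(u)}$ and, using that the edge lengths of
$\hat{\mathcal{T}}$ are independent and exponentially distributed with parameter $\gamma = \psi'(1)$, we have with
$k_u = k_u(\hat{T})$:

\[ \mathbf{N}[\zeta_u > \theta \mid \hat{T}] = \mathbf{N}[\tau_{C(u)} > \theta \mid \hat{T}]\; \mathbf{N}[\tau_{B(u)} > \theta \mid \hat{T}] = \left( \frac{1}{1 + \theta} \right)^{k_u - \gamma} \frac{\psi'(1)}{\psi'(1 + \theta)} = \left( \frac{1}{1 + \theta} \right)^{k_u - 1}, \]

where the factor $\psi'(1)/\psi'(1 + \theta)$ is the expectation of $e^{-(\psi'(1 + \theta) - \psi'(1)) h_u}$ over the exponential lifetime $h_u$.

Since the cutting times $\tau_{C(u)}$ and $\tau_{B(u)}$ are independent for all internal nodes $u$, we recover
the discrete pruning procedure that defines the process $(\mathcal{P}_\theta(T), \theta \geq 0)$ under $\mathbb{P}$. To conclude,
notice that $\hat{T}$ and $T$ are GW trees with offspring distribution characterized by the generating
function $g$. □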
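The product of survival probabilities in the proof above can be checked numerically. The sketch below assumes the stable branching mechanism $\psi(\lambda) = \lambda^\gamma$ with $1 < \gamma < 2$; the function names are illustrative and not taken from the paper. It uses only the Laplace transform of an exponential variable of parameter $\gamma$, namely $\gamma / (\gamma + c)$.

```python
def psi_prime(lam: float, gamma: float) -> float:
    """Derivative of the (assumed) stable branching mechanism psi(lam) = lam**gamma."""
    return gamma * lam ** (gamma - 1.0)


def node_survival(k: int, theta: float, gamma: float) -> float:
    """psi^(k)(1+theta) / psi^(k)(1), which equals (1+theta)**(gamma-k) for 1 < gamma < 2."""
    return (1.0 + theta) ** (gamma - k)


def branch_survival(theta: float, gamma: float) -> float:
    """E[exp(-c*h)] for h exponential with rate psi'(1) and c = psi'(1+theta) - psi'(1)."""
    rate = psi_prime(1.0, gamma)
    c = psi_prime(1.0 + theta, gamma) - rate
    return rate / (rate + c)


def mark_survival(k: int, theta: float, gamma: float) -> float:
    """Probability that a skeleton node with k children is still unmarked at time theta."""
    return node_survival(k, theta, gamma) * branch_survival(theta, gamma)


if __name__ == "__main__":
    for k in (2, 3, 5):
        print(k, mark_survival(k, 0.7, 1.5), (1.0 + 0.7) ** (1 - k))
```

The two printed columns agree, in line with the identity $(1 + \theta)^{\gamma - k_u} \cdot (1 + \theta)^{1 - \gamma} = (1 + \theta)^{1 - k_u}$.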

4.4. Proof of Theorem 1.2. The next corollary states that the pruning procedure for stable
GW trees developed in [7] and the pruning procedure for Lévy trees developed in [1] and
applied in [8] to sub-trees with a finite number of leaves coincide.

Corollary 4.3. Let $n \in \mathbb{N}$. The process $(\hat{T}_\theta, \theta \geq 0)$ is distributed under $\mathbf{N}[\,\cdot \mid M = n]$ as the
process $(\mathcal{P}_\theta(T), \theta \geq 0)$ under $\mathbb{P}_n$.

Proof. This is a direct consequence of Proposition 4.2 and the fact that $M = L(\hat{T})$. □

Theorem 1.2 follows directly from Theorem 1.1 and from the following corollary, which is a
direct consequence of Corollary 4.3. Recall that $\Pi^{[n]}_{\text{Lévy}}$ is the restriction of $\Pi_{\text{Lévy}}$ defined in
Section 4.2.2 to the $n$ first integers.

Corollary 4.4. The process $\Pi^{[n]}_{\text{Lévy}}$ is under $\mathbf{N}^{(1)}$ distributed as $\Pi^{[n]}_{\text{GW}}$ under $\mathbb{P}_n$.

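5. Proof of Proposition 1.4

We recall results from [21], [24], [22]. Let $X_n$ be the number of coalescence events for a
$\beta(a, b)$-coalescent. For $1 < a < 2$ and $b > 0$, we have that:

\[ \frac{2 - a}{\Gamma(a)}\, n^{a - 2} X_n \]

converges in distribution towards

\[ W_{a,b} = \int_0^{+\infty} dt\; e^{-(2 - a) S_{a,b}(t)}, \]

where $S_{a,b}$ is a subordinator with Laplace exponent $\phi_{a,b}$ given by:

\[ \phi_{a,b}(\lambda) = \int_0^1 \left( 1 - (1 - x)^\lambda \right) x^{a - 3} (1 - x)^{b - 1}\, dx. \]

Notice that this notation is consistent with (14). Since $Z_n$ is distributed as $X_n$ with $a = 1 + \alpha$
and $b = 1 - \alpha$, we deduce that:

\[ n^{\alpha - 1} Z_n \xrightarrow[n \to +\infty]{\text{(d)}} Z, \]

with $Z$ distributed as $\frac{\Gamma(1 + \alpha)}{1 - \alpha}\, W_{1+\alpha, 1-\alpha}$.

Using Lemma 3.1, we compute the moments of $Z$:

\[
\begin{aligned}
\mathbb{E}\left[ W^n_{1+\alpha, 1-\alpha} \right]
&= n! \int_{0 < t_1 < \cdots < t_n} \mathbb{E}\left[ e^{-(1 - \alpha) \sum_{k=1}^n S_{1+\alpha, 1-\alpha}(t_k)} \right] dt_1 \cdots dt_n \\
&= n! \int_{0 < r_1, \ldots, 0 < r_n} \prod_{k=1}^n \mathbb{E}\left[ e^{-(1 - \alpha) k\, S_{1+\alpha, 1-\alpha}(r_k)} \right] dr_1 \cdots dr_n \\
&= \frac{n!}{\prod_{k=1}^n \phi_{1+\alpha, 1-\alpha}(k(1 - \alpha))} \\
&= \left( \frac{1 - \alpha}{\Gamma(\alpha)} \right)^n \frac{\Gamma(n + 1)\, \Gamma(1 - \alpha)}{\Gamma((n + 1)(1 - \alpha))}.
\end{aligned}
\]

We deduce that:

\[ \mathbb{E}[Z^n] = \alpha^n\, \frac{\Gamma(n + 1)\, \Gamma(1 - \alpha)}{\Gamma((n + 1)(1 - \alpha))}. \]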
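The moment formula can be cross-checked numerically. The sketch below is not part of the paper: it assumes the closed form $\phi_{1+\alpha,1-\alpha}(\lambda) = \frac{\Gamma(\alpha)}{1-\alpha} \frac{\Gamma(\lambda + 1 - \alpha)}{\Gamma(\lambda)}$ (obtained here by analytic continuation of the Beta integral; the statement of Lemma 3.1 is not quoted in this section), and all function names are hypothetical. It compares the product formula for the moments with the closed expression for $\mathbb{E}[Z^n]$.

```python
import math


def phi(lam: float, alpha: float) -> float:
    """ASSUMED closed form of the Laplace exponent phi_{1+alpha,1-alpha}(lam)."""
    return math.gamma(alpha) / (1.0 - alpha) * math.gamma(lam + 1.0 - alpha) / math.gamma(lam)


def moment_W(n: int, alpha: float) -> float:
    """n-th moment of W via n! / prod_{k=1..n} phi(k*(1-alpha))."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= phi(k * (1.0 - alpha), alpha)
    return math.factorial(n) / prod


def moment_Z(n: int, alpha: float) -> float:
    """n-th moment of Z = Gamma(1+alpha)/(1-alpha) * W."""
    scale = math.gamma(1.0 + alpha) / (1.0 - alpha)
    return scale ** n * moment_W(n, alpha)


def moment_Z_closed(n: int, alpha: float) -> float:
    """Closed form alpha^n Gamma(n+1) Gamma(1-alpha) / Gamma((n+1)(1-alpha))."""
    return alpha ** n * math.gamma(n + 1) * math.gamma(1.0 - alpha) / math.gamma((n + 1) * (1.0 - alpha))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, moment_Z(n, 0.75), moment_Z_closed(n, 0.75))
```

For $\alpha = 1/2$ the closed form reduces to $\Gamma(n/2 + 1)$ by the duplication formula, which are the moments of a Rayleigh variable divided by $\sqrt{2}$.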



6. Number of blocks in the last coalescence event

We consider the number of blocks $B_n$ involved in the last coalescence event of $\Pi^{[n]}$. In
order to stress the dependence in $n$, we shall denote by $T_n$ the GW tree $T$ under $\mathbb{P}_n$. We
also write $\xi_u(T_n)$ for $\xi_u$ to stress the dependence of the marks introduced in Section 2.3 as
a function of the underlying tree $T_n$. Notice that the time $\xi_\emptyset(T_n)$ at which the root of $T_n$ is
marked corresponds to the last coalescence event associated with $T_n$. Thanks to Theorem 1.1,
$B_n$ is distributed as the number of leaves of the pruned tree obtained from $T_n$ just before the
last coalescence event, that is:

\[ B_n = L\left( \mathcal{P}_{\xi_\emptyset(T_n)-}(T_n) \right). \tag{20} \]

6.1. Local limit. The method used in [3] when $\alpha = 1/2$ relies on Aldous's CRT, which
is the (global) limit of $T_n$ when the lengths of the branches of $T_n$ are rescaled by $1/\sqrt{n}$, see [15].
Since Lévy trees are more difficult to handle, we choose here to use the local limit of $T_n$,
which is Kesten's tree $T^*$, according to [5].

Recall that $\nu_g$ is the distribution with generating function $g$ given in (3) and that $\nu_g$ is
critical as $g'(1) = 1$. We recall the distribution of Kesten's tree $T^*$ associated with the
critical reproduction law $\nu_g$, see [26]. Let $\nu^*$ be the corresponding size-biased distribution:
$\nu^*(k) = k \nu_g(k)$ for all $k \in \mathbb{N}$. For $h \in \mathbb{N}$, we consider the truncation operator $r_h$ on $\mathbb{T}$ defined
as:

\[ r_h t = \{u \in t;\ |u| \leq h\}. \]

The distribution of $T^*$ is as follows. Almost surely, $T^*$ contains a unique infinite path, i.e.
a unique infinite sequence $(V_k, k \in \mathbb{N}^*)$ of positive integers such that, for every $h \in \mathbb{N}$,
$V_1 \cdots V_h \in T^*$, with the convention that $V_1 \cdots V_h = \emptyset$ if $h = 0$. The joint distribution of
$(V_k, k \in \mathbb{N}^*)$ and $T^*$ is determined recursively as follows: for each $h \in \mathbb{N}$, conditionally given
$(V_1, \ldots, V_h)$ and $r_h T^*$, we have:

• The numbers of children $(k_v(T^*), v \in T^*, |v| = h)$ are independent and distributed
according to $\nu_g$ if $v \neq V_1 \cdots V_h$ and according to $\nu^*$ if $v = V_1 \cdots V_h$.

• Given also the numbers of children $(k_v(T^*), v \in T^*, |v| = h)$, the vertex $V_{h+1}$ is
uniformly distributed on the set of integers $\{1, \ldots, k_{V_1 \cdots V_h}(T^*)\}$.
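The two offspring laws used in this description can be tabulated directly. The following sketch assumes that the generating function in (3) is $g(r) = r + \alpha (1 - r)^{1/\alpha}$ (this is how (3) is understood here; the equation itself is not quoted in this section) and expands $(1 - r)^{1/\alpha}$ with the generalized binomial series. The function names are illustrative.

```python
def offspring_law(alpha: float, kmax: int) -> list:
    """nu_g(0), ..., nu_g(kmax) for the ASSUMED g(r) = r + alpha * (1 - r)**(1/alpha)."""
    gamma = 1.0 / alpha
    coeff = 1.0  # coefficient of r**0 in (1 - r)**gamma
    nu = []
    for k in range(kmax + 1):
        if k > 0:
            # ratio of consecutive coefficients of the binomial series of (1 - r)**gamma
            coeff *= (k - 1 - gamma) / k
        if k == 1:
            nu.append(0.0)  # 1 - alpha * gamma vanishes exactly: no individual has one child
        else:
            nu.append(alpha * coeff)
    return nu


def size_biased(nu: list) -> list:
    """nu*(k) = k * nu_g(k), the law of the number of children along the infinite path."""
    return [k * p for k, p in enumerate(nu)]


if __name__ == "__main__":
    nu = offspring_law(0.75, 10)
    print([round(p, 5) for p in nu])
    print([round(p, 5) for p in size_biased(nu)])
```

For $\alpha = 1/2$ the law reduces to $\nu_g(0) = \nu_g(2) = 1/2$, the critical binary case. For $\alpha > 1/2$ the tail decays like $k^{-1 - 1/\alpha}$, so truncated sums of $k \nu_g(k)$ approach the critical mean 1 only slowly.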

Recall that the height of a discrete tree $t \in \mathbb{T}$ is $H_{\max}(t) = \sup\{|u|,\ u \in t\}$. For $h \in \mathbb{N}^*$,
we have that for all $t \in \mathbb{T}$ with height $h$ and $u \in t$ with $|u| = h$:

\[ \mathbb{P}(r_h T^* = t,\ V_1 \cdots V_h = u) = \mathbb{P}(r_h T = t). \]

The local limit convergence of critical GW trees, see [5], implies that, for all $h \in \mathbb{N}^*$, $t \in \mathbb{T}$
with height $h$:

\[ \lim_{n \to +\infty} \mathbb{P}_n(r_h T_n = t) = \mathbb{P}(r_h T^* = t). \]

For $t \in \mathbb{T}$ and $u \in t$, we consider the sub-tree $t_u$ attached at $u$ defined by:

\[ t_u = \{w \in \mathcal{U};\ uw \in t\}. \]
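By construction of the marks, we deduce the following result.

Lemma 6.1. We have, for all $t \in \mathbb{T}$:

\[ \lim_{n \to +\infty} \mathbb{P}_n\left( \mathcal{P}_{\xi_\emptyset(T_n)-}(T_n) = t \right) = \mathbb{P}(\tilde{T} = t), \]

where $\tilde{T}$ is such that:

• $k_\emptyset(\tilde{T})$ has distribution $\nu^*$.

• Conditionally on $k_\emptyset(\tilde{T})$, $\xi$ is a random variable such that $\mathbb{P}(\xi > \theta) = (1 + \theta)^{1 - k_\emptyset(\tilde{T})}$
for all $\theta \geq 0$.

• Conditionally on $k_\emptyset(\tilde{T})$ and $\xi$, $V_1$ is a uniform random variable on $\{1, \ldots, k_\emptyset(\tilde{T})\}$.

• Conditionally on $k_\emptyset(\tilde{T})$, $\xi$ and $V_1$, $(\tilde{T}_u, u \in \{1, \ldots, k_\emptyset(\tilde{T})\})$ are independent random
trees distributed such that for $u \neq V_1$, $\tilde{T}_u$ is distributed as $\mathcal{P}_\xi(T)$ with $T$ a GW tree
with reproduction law $\nu_g$, and $\tilde{T}_{V_1}$ is distributed as $\mathcal{P}_\xi(T^*)$, with $T^*$ distributed as
Kesten's tree associated with the reproduction law $\nu_g$.

Notice that by construction, $\tilde{T}$ is finite.

6.2. Proof of Proposition 1.5. We deduce from (20), Lemma 6.1 and the fact that $\tilde{T}$ is
a.s. finite, that $B_n$ converges in distribution to $B = L(\tilde{T})$. From Lemma 6.1, we have that $B$
is distributed as

\[ L(\mathcal{P}_\xi(T^*)) + \sum_{k=1}^{k_\emptyset - 1} L(\mathcal{P}_\xi(T_k)), \]

where $k_\emptyset$ has distribution $\nu^*$, $\xi$ has density $(k_\emptyset - 1)(1 + \theta)^{-k_\emptyset} \mathbf{1}_{\{\theta > 0\}}$, $T^*$ is independent and
distributed as Kesten's tree associated with $\nu_g$, and $(T_k, k \in \mathbb{N}^*)$ are independent and
distributed as a Galton-Watson tree $T$ with reproduction law $\nu_g$. We deduce that:

\[ \mathbb{E}\left[ r^B \right] = \mathbb{E}\left[ N(N - 1) \int_0^{+\infty} (1 + \theta)^{-N}\, d\theta\; \mathbb{E}\left[ r^{L_\theta} \right]^{N - 1} \mathbb{E}\left[ r^{L^*_\theta} \right] \right], \]

where $N$ has distribution $\nu_g$, $L_\theta$ is the number of leaves of $\mathcal{P}_\theta(T)$ and $L^*_\theta$ is the number of
leaves of $\mathcal{P}_\theta(T^*)$.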
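The marking time $\xi$ of the root, with survival function $(1 + \theta)^{1 - k}$ and density $(k - 1)(1 + \theta)^{-k}$ given $k_\emptyset = k$, can be simulated by inverting the survival function. The sketch below is only an illustration of that law; the helper names are hypothetical.

```python
import random


def xi_survival(theta: float, k: int) -> float:
    """P(xi > theta) = (1 + theta)**(1 - k) for a root with k >= 2 children."""
    return (1.0 + theta) ** (1 - k)


def xi_quantile(u: float, k: int) -> float:
    """The value theta >= 0 such that P(xi > theta) = u, for 0 < u <= 1."""
    return u ** (-1.0 / (k - 1)) - 1.0


def sample_xi(k: int, rng: random.Random) -> float:
    """Draw xi by inverse transform; 1 - rng.random() is uniform on (0, 1]."""
    u = 1.0 - rng.random()
    return xi_quantile(u, k)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_xi(3, rng) for _ in range(20000)]
    # empirical and exact values of P(xi > 1) when k = 3
    print(sum(d > 1.0 for d in draws) / len(draws), xi_survival(1.0, 3))
```

Larger values of $k$ give stochastically smaller marking times, which reflects that nodes with many children are cut earlier.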

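Let $h_\theta$ be the generating function of $L_\theta$ and $h^*_\theta$ be the generating function of $L^*_\theta$. We have:

\[ \mathbb{E}\left[ r^B \right] = \int_0^{+\infty} \frac{d\theta}{(1 + \theta)^2}\; g''\!\left( \frac{h_\theta(r)}{1 + \theta} \right) h_\theta(r)\, h^*_\theta(r). \]

Recall that $\mathcal{P}_\theta(T)$ is a GW tree whose reproduction law has generating function $g_\theta$ given
by (8). Similar arguments as in the proof of (13) yield that:

\[ g_\theta(h_\theta(r)) - h_\theta(r) = g_\theta(0)(1 - r). \tag{21} \]

We deduce from (8) that:

\[ g''\!\left( \frac{h_\theta(r)}{1 + \theta} \right) = (1 + \theta)\, g''_\theta(h_\theta(r)). \]

We deduce from (21) that:

\[ \left( 1 - g'_\theta(h_\theta(r)) \right) h'_\theta(r) = g_\theta(0) \quad \text{and} \quad g''_\theta(h_\theta(r)) = \left( 1 - g'_\theta(h_\theta(r)) \right) \frac{h''_\theta(r)}{(h'_\theta(r))^2}. \tag{22} \]

We obtain:

\[ \mathbb{E}\left[ r^B \right] = \int_0^{+\infty} \frac{d\theta}{1 + \theta}\; \left( 1 - g'_\theta(h_\theta(r)) \right) \frac{h''_\theta(r)}{(h'_\theta(r))^2}\; h_\theta(r)\, h^*_\theta(r). \]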


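We now compute $h^*_\theta$. According to Remark 3.7 in [8], we have for $t \in \mathbb{T}$:

\[ \mathbb{P}(\mathcal{P}_\theta(T^*) = t) = L(t)\, \frac{1 - g'_\theta(1)}{g_\theta(0)}\, \mathbb{P}(\mathcal{P}_\theta(T) = t). \]

We deduce that:

\[ h^*_\theta(r) = \mathbb{E}\left[ r^{L^*_\theta} \right] = \sum_{t \in \mathbb{T}} r^{L(t)}\, \mathbb{P}(\mathcal{P}_\theta(T^*) = t) = \frac{1 - g'_\theta(1)}{g_\theta(0)} \sum_{t \in \mathbb{T}} L(t)\, r^{L(t)}\, \mathbb{P}(\mathcal{P}_\theta(T) = t) = \frac{r\, h'_\theta(r)}{h'_\theta(1)}, \]

where we used the first equality in (22) with $r = 1$ and $h_\theta(1) = 1$. We get:

\[ \mathbb{E}\left[ r^B \right] = \int_0^{+\infty} \frac{d\theta}{1 + \theta}\; \frac{g_\theta(0)}{h'_\theta(1)}\; \frac{h''_\theta(r)}{(h'_\theta(r))^2}\; r\, h_\theta(r). \tag{23} \]

We have from (8) that:

\[ g_\theta(0) = \alpha (1 + \theta) \left( 1 - \left( \frac{\theta}{1 + \theta} \right)^{1/\alpha} \right). \]

We deduce from (21) that:

\[ h_\theta(r) = (1 + \theta) \left[ 1 - \left( 1 - r \left( 1 - \left( \frac{\theta}{1 + \theta} \right)^{1/\alpha} \right) \right)^{\alpha} \right]. \]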

Then, the change of variable $x = 1 - (\theta/(1 + \theta))^{1/\alpha}$ in (23) gives that 99^, given in ©, is
the generating function of $B$.